Capstone Project - RSNA Pneumonia Detection Challenge

Problem Statement

In this capstone project, the goal is to build a pneumonia detection system that locates the position of inflammation in a chest X-ray image. Tissues with sparse material, such as the lungs, which are full of air, do not absorb X-rays and appear black in the image. Dense tissues such as bone absorb X-rays and appear white. While we are theoretically detecting “lung opacities”, some lung opacities are not pneumonia related. In the data, these are labeled “No Lung Opacity / Not Normal”. This extra third class indicates that while pneumonia was determined not to be present, there was nonetheless some type of abnormality on the image, and such findings often mimic the appearance of true pneumonia.

DICOM original images: Medical images are stored in a special format called DICOM files (*.dcm). They contain a combination of header metadata and the underlying raw image arrays for the pixel data.

Here’s the backstory and why solving this problem matters

Pneumonia accounts for over 15% of all deaths of children under 5 years old internationally. In 2015, 920,000 children under the age of 5 died from the disease. In the United States, pneumonia accounts for over 500,000 visits to emergency departments and over 50,000 deaths in 2015, keeping the ailment on the list of top 10 causes of death in the country.

While common, accurately diagnosing pneumonia is a tall order. It requires review of a chest radiograph (CXR) by highly trained specialists and confirmation through clinical history, vital signs and laboratory exams. Pneumonia usually manifests as an area or areas of increased opacity on CXR. However, the diagnosis of pneumonia on CXR is complicated because of a number of other conditions in the lungs such as fluid overload (pulmonary edema), bleeding, volume loss (atelectasis or collapse), lung cancer, or post-radiation or surgical changes. Outside of the lungs, fluid in the pleural space (pleural effusion) also appears as increased opacity on CXR. When available, comparison of CXRs of the patient taken at different time points and correlation with clinical symptoms and history are helpful in making the diagnosis.

CXRs are the most commonly performed diagnostic imaging study. A number of factors such as positioning of the patient and depth of inspiration can alter the appearance of the CXR, complicating interpretation further. In addition, clinicians are faced with reading high volumes of images every shift.

To improve the efficiency and reach of diagnostic services, the Radiological Society of North America (RSNA®) has reached out to Kaggle’s machine learning community and collaborated with the US National Institutes of Health, The Society of Thoracic Radiology, and MD.ai to develop a rich dataset for this challenge. (Description Source)

Data Source: https://www.kaggle.com/c/rsna-pneumonia-detection-challenge/data

Import Packages

Exploratory Data Analysis (EDA)

Here, as part of EDA, the following objectives are achieved:

Reading CSVs

Observations from the CSVs

Based on the analysis above, some of the observations are:

Reading Images

Images provided are stored in DICOM (.dcm) format which is an international standard to transmit, store, retrieve, print, process, and display medical imaging information. Digital Imaging and Communications in Medicine (DICOM) makes medical imaging information interoperable. We will make use of pydicom package here to read the images.

Observations from dicom image files

From the sample above, we can see that a DICOM file contains information that can be used for further analysis, such as sex, age, body part examined (which should mostly be chest), view position and modality.

Feature extraction from the dicom image files

Above we identified some features from the DICOM files that can be explored and used; let's focus on the following analysis of the image files.

To extract the features from the DICOM image files, we will make use of the function get_tags.
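As a hedged sketch (the notebook's actual get_tags implementation is not reproduced here), such a helper might read selected header fields from the Dataset object that pydicom.dcmread returns:

```python
# Sketch of a get_tags-style helper; the real notebook function may differ.
# It collects the requested DICOM header fields from a pydicom Dataset (or
# any object exposing them as attributes) into a dict.
DEFAULT_TAGS = ("PatientSex", "PatientAge", "ViewPosition",
                "BodyPartExamined", "Modality")

def get_tags(ds, tags=DEFAULT_TAGS):
    """Return the requested header fields, with None for missing tags."""
    return {tag: getattr(ds, tag, None) for tag in tags}

# Typical use with pydicom (path is illustrative):
#   import pydicom
#   ds = pydicom.dcmread("stage_2_train_images/<patientId>.dcm")
#   row = get_tags(ds)
```

Applying this per image yields one metadata row per patient, which can then be joined with the CSV labels.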

Understanding different View Positions

As seen below, the two view positions in the training dataset are AP (Anterior/Posterior) and PA (Posterior/Anterior). These types of X-rays are mostly used to obtain the front view. Apart from the front view, a lateral image is usually taken to complement it.

Exploring the bounding boxes for both view positions

Observations: BodyPartExamined & ViewPosition

Above we saw,

We can make use of Series.clip() to trim values to a specified lower and upper threshold. So an upper threshold of 100 on PatientAge means the outlier values above 100 are converted to 100.
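A minimal sketch of that clipping step (the age values here are illustrative, not taken from the dataset):

```python
import pandas as pd

# Illustrative ages, including implausible outliers above 100 of the kind
# the PatientAge field can contain.
ages = pd.Series([34, 51, 148, 155, 62], name="PatientAge")

# Trim everything above the chosen upper threshold of 100.
clipped = ages.clip(upper=100)
print(clipped.tolist())  # [34, 51, 100, 100, 62]
```

Values at or below the threshold pass through unchanged; only the outliers are capped.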

Using Binning Method for PatientAge feature

We'll make use of pd.cut, which "bins values into discrete intervals". This method is recommended when the need is to segment and sort data values into bins, and it is also useful for going from a continuous variable to a categorical one. It supports binning into an equal number of bins or a pre-specified array of bin edges.
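A minimal sketch of pd.cut with decade-wide bins (the bin edges and ages here are illustrative, not necessarily the ones used in the notebook):

```python
import pandas as pd

ages = pd.Series([5, 23, 47, 68, 91], name="PatientAge")

# Decade-wide intervals: (0, 10], (10, 20], ..., (90, 100]
age_groups = pd.cut(ages, bins=range(0, 101, 10))
print([str(iv) for iv in age_groups])
# ['(0, 10]', '(20, 30]', '(40, 50]', '(60, 70]', '(90, 100]']
```

Each age lands in exactly one right-closed interval, turning the continuous PatientAge into a categorical feature suitable for grouped plots.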

Patient Sex

Patient Age

Plotting DICOM Images

Observations: PatientAge & PatientSex

Above we saw,

Only PatientAge, PatientSex and ViewPosition are useful features from metadata.

Dropping the other features from the train_class dataframe and saving it as a pickle file
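A hedged sketch of that step; the exact column set the notebook keeps (e.g. identifier and label columns) may differ, and the pickle filename is illustrative:

```python
def keep_useful_features(df, keep=("PatientAge", "PatientSex", "ViewPosition")):
    """Drop every metadata column not listed in `keep`."""
    return df[[c for c in df.columns if c in keep]]

# Typical use (dataframe name and filename are illustrative):
#   train_class = keep_useful_features(train_class)
#   train_class.to_pickle("train_class.pkl")
```

Pickling the reduced dataframe avoids re-running the DICOM tag extraction on later notebook sessions.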

Check some random samples from training data

Checking some random samples as below:

Now, we will make use of the custom module (eda) and its function (plot_dicom_images), already imported earlier, to visualize the images.

MobileNet

YOLOv5

VGG19

Plotting Accuracy and Validation Accuracy

Plotting Loss and Validation Loss
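These curves can be produced from a Keras History object; a minimal sketch, assuming the default "loss"/"val_loss" key names that Keras records per epoch:

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen; drop this line in a notebook
import matplotlib.pyplot as plt

def plot_history(history, metric="loss"):
    """Plot training vs. validation curves for one metric.

    `history` is a dict like keras History.history,
    e.g. {"loss": [...], "val_loss": [...]}.
    """
    epochs = range(1, len(history[metric]) + 1)
    fig, ax = plt.subplots()
    ax.plot(epochs, history[metric], label=f"Training {metric}")
    ax.plot(epochs, history[f"val_{metric}"], label=f"Validation {metric}")
    ax.set_xlabel("Epoch")
    ax.set_ylabel(metric)
    ax.legend()
    return fig
```

Calling it with metric="accuracy" produces the accuracy/validation-accuracy plot the same way.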

Model Testing

AUC Curve
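A minimal sketch of computing the ROC points and AUC with scikit-learn; the labels and scores below are illustrative placeholders, not the model's outputs:

```python
import numpy as np
from sklearn.metrics import roc_auc_score, roc_curve

# Illustrative ground-truth labels and predicted probabilities.
y_true = np.array([0, 0, 1, 1])
y_score = np.array([0.1, 0.4, 0.35, 0.8])

fpr, tpr, thresholds = roc_curve(y_true, y_score)
auc = roc_auc_score(y_true, y_score)
print(auc)  # 0.75
```

Plotting tpr against fpr gives the ROC curve itself, with the AUC summarizing it as a single number.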

Prediction

VGG16

Plotting Loss and Validation Loss

Model Testing

ROC Curve

Summary

We started with EDA of the given dataset and found how the various attributes (obtained from both the CSV files and the images) are spread across the entire dataset.

After making the interim submission, where we proposed a solution using MobileNet and transfer learning (using U-Net), there were two challenges for us: first, to write a function that can load the entire dataset at once, and second, model selection.

Model selection was a challenge because both localization and classification needed to be clubbed into one model. We tried to develop models based on ResNet, MobileNet, VGG16/19 and YOLO.

After working with different models, we realized that U-Net with binary classification worked better for us. Moreover, we found that U-Net is also widely used in medical applications, hence the switch to U-Net. However, we encountered a data imbalance problem (as with RCNN), so we tried other models as well: MobileNet, YOLO and VGG19/16.

After training all three variants, we stuck with VGG19/16.

With this model we were successfully able to predict pneumonia. The overall single-batch accuracy was 88% with the VGG19 model, with good recall and precision.

We also got a single-batch accuracy of 82% with VGG16, with good recall and average precision.

So, amongst all our models, VGG19 turned out to be the one with the most practical advantage.